@GitHK GitHK commented Dec 9, 2024

devops ⚠️

Additional steps are required when releasing:

  1. Make sure all sidecars have been removed.
  2. Go to Portainer -> Containers.
  3. Look for containers that belong to a service by searching for dy-sidecar-
  4. Remove all containers found this way, regardless of their state (running, created, stopped, etc.).

The procedure was applied to:

Master:

  • master internal
  • master AWS

Staging:

  • staging DALCO
  • staging AWS

Prod:

  • prod DALCO
  • prod osparc.io
  • prod s4l
  • prod tip internal
  • prod tip AWS

What do these changes do?

When cleaning up all resources used by a new-style dynamic service, the director-v2 now also asks the agent to remove any containers left over from the latest run of the service.
The agent searches for all containers whose name carries a prefix identifying a proxy, sidecar or user service for a given node_id. Every container it finds is removed.
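
A minimal sketch of the prefix matching described above, not the agent's actual code. It assumes container names start with a kind prefix followed by the node_id; only "dy-sidecar-" is quoted in this PR, while the other prefixes and the names `ASSUMED_PREFIXES`, `prefixes_for_node` and `find_leftover_containers` are hypothetical.

```python
from collections.abc import Iterable
from uuid import UUID

# Placeholder prefixes, one per container kind named in the description.
# Only "dy-sidecar-" is quoted in this PR; the other two are hypothetical.
ASSUMED_PREFIXES: dict[str, str] = {
    "proxy": "dy-proxy-",
    "sidecar": "dy-sidecar-",
    "user_service": "dy-user-",
}


def prefixes_for_node(node_id: UUID) -> tuple[str, ...]:
    """Build every name prefix that could belong to the given node_id."""
    return tuple(f"{prefix}{node_id}" for prefix in ASSUMED_PREFIXES.values())


def find_leftover_containers(
    container_names: Iterable[str], node_id: UUID
) -> list[str]:
    """Return the names that start with one of the node's prefixes."""
    prefixes = prefixes_for_node(node_id)
    # str.startswith accepts a tuple and matches if any prefix fits.
    return [name for name in container_names if name.startswith(prefixes)]
```

Matching only on name keeps the selection independent of container state, so running, created and stopped containers are all picked up in the same pass.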

Related issue/s

How to test

Dev-ops checklist

@GitHK GitHK self-assigned this Dec 9, 2024
codecov bot commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 82.72727% with 19 lines in your changes missing coverage. Please review.

Project coverage is 86.92%. Comparing base (1ce9f08) to head (e452ef4).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6924      +/-   ##
==========================================
- Coverage   87.11%   86.92%   -0.19%     
==========================================
  Files        1608     1441     -167     
  Lines       63507    59803    -3704     
  Branches     2024     1635     -389     
==========================================
- Hits        55322    51985    -3337     
+ Misses       7851     7547     -304     
+ Partials      334      271      -63     
Flag                            Coverage Δ
integrationtests                64.87% <100.00%> (-0.05%) ⬇️
unittests                       85.11% <82.72%> (-0.69%) ⬇️

Components                      Coverage Δ
api                             ∅ <ø> (∅)
pkg_aws_library                 ∅ <ø> (∅)
pkg_dask_task_models_library    ∅ <ø> (∅)
pkg_models_library              91.36% <100.00%> (+<0.01%) ⬆️
pkg_notifications_library       ∅ <ø> (∅)
pkg_postgres_database           ∅ <ø> (∅)
pkg_service_integration         70.02% <ø> (ø)
pkg_service_library             74.29% <0.00%> (-0.24%) ⬇️
pkg_settings_library            ∅ <ø> (∅)
pkg_simcore_sdk                 85.38% <ø> (ø)
agent                           96.82% <96.05%> (-0.19%) ⬇️
api_server                      90.13% <ø> (ø)
autoscaling                     96.09% <ø> (ø)
catalog                         90.57% <ø> (ø)
clusters_keeper                 99.48% <ø> (ø)
dask_sidecar                    91.26% <ø> (ø)
datcore_adapter                 93.18% <ø> (ø)
director                        76.40% <ø> (ø)
director_v2                     91.41% <100.00%> (+<0.01%) ⬆️
dynamic_scheduler               97.03% <ø> (ø)
dynamic_sidecar                 89.75% <ø> (ø)
efs_guardian                    90.12% <ø> (ø)
invitations                     93.44% <ø> (ø)
osparc_gateway_server           ∅ <ø> (∅)
payments                        92.66% <ø> (ø)
resource_usage_tracker          89.65% <ø> (+0.06%) ⬆️
storage                         89.54% <ø> (ø)
webclient                       ∅ <ø> (∅)
webserver                       84.38% <ø> (-0.11%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@GitHK GitHK added this to the Event Horizon milestone Dec 9, 2024
@GitHK GitHK changed the title from "♻️ Containers are also removed via agent when the dynamic-sidecar is stopped" to "♻️ Containers are also removed via agent when the dynamic-sidecar is stopped (⚠️ devops)" Dec 9, 2024
@GitHK GitHK marked this pull request as ready for review December 11, 2024 07:39
@GitHK GitHK added the a:agent (agent service) and a:director-v2 (issue related with the director-v2 service) labels Dec 11, 2024
Member

@mrnicegyu11 mrnicegyu11 left a comment

Thanks, looks good. I have some minor questions. Full disclosure: I only read the code, not the tests.

@GitHK GitHK requested a review from mrnicegyu11 December 11, 2024 10:37
Member

@sanderegg sanderegg left a comment

Just to be sure I understand this change correctly.

So now the dv-2 will call all the agents to remove containers with some UUID, correct?

  • When exactly is this call made?
  • What happens if the call fails on one agent, e.g. with a returned exception? Will this stop something in the dv-2 from running correctly?
  • Will this affect performance when I start the service anew? (For example, in auto-scaled deployments the dangling container has most probably gone with the machine.)

Member

@pcrespov pcrespov left a comment

👀

GitHK commented Dec 13, 2024

> Just to be sure I understand this change correctly.
>
> So now the dv-2 will call all the agents to remove containers with some UUID, correct?
>
>   • When exactly is this call made?
>   • What happens if the call fails on one agent, e.g. with a returned exception? Will this stop something in the dv-2 from running correctly?
>   • Will this affect performance when I start the service anew? (For example, in auto-scaled deployments the dangling container has most probably gone with the machine.)

Not precisely. It works as follows:

  • To remove containers, we call one specific agent (the one running on the node where the service was started) and ask it to remove that service's containers.
  • This is achieved by embedding the docker_node_id in the RPC method name.
  • If something goes wrong (agent unreachable, timeout, etc.), an error is raised.
  • There is no impact on performance (this is also run when closing the service).
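
The routing described in these points can be sketched roughly as follows. This is an illustration under assumptions, not the director-v2 code: the method-name format, the `call_rpc` callable, the timeout value and the names `AgentCallError`, `rpc_method_name` and `remove_containers_on_node` are all hypothetical.

```python
import asyncio
from collections.abc import Awaitable, Callable


class AgentCallError(RuntimeError):
    """Hypothetical error raised when the targeted agent cannot be reached."""


def rpc_method_name(base: str, docker_node_id: str) -> str:
    # Assumed format: the docker_node_id is embedded in the method name so
    # that only the agent running on that node answers the call.
    return f"{base}__{docker_node_id}"


async def remove_containers_on_node(
    call_rpc: Callable[..., Awaitable[None]],
    docker_node_id: str,
    node_id: str,
    timeout_s: float = 5.0,
) -> None:
    """Ask the agent on one docker node to remove a service's containers."""
    method = rpc_method_name("remove_containers", docker_node_id)
    try:
        await asyncio.wait_for(call_rpc(method, node_id=node_id), timeout=timeout_s)
    except asyncio.TimeoutError as err:
        # Surface the failure to the caller instead of swallowing it.
        raise AgentCallError(f"no reply from agent for {method}") from err
```

Because the target is encoded in the method name, a single call reaches a single agent, and a failure shows up as one exception that the caller can handle or propagate.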

Member

@mrnicegyu11 mrnicegyu11 left a comment

We finally resolved the disagreements in person, thanks a lot and good for me :--)

@GitHK GitHK requested review from pcrespov and sanderegg December 16, 2024 08:40
@GitHK GitHK enabled auto-merge (squash) December 16, 2024 13:42
Andrei Neagu added 3 commits December 16, 2024 16:02
@pcrespov pcrespov disabled auto-merge December 17, 2024 09:22
@pcrespov pcrespov merged commit 75aed81 into ITISFoundation:master Dec 17, 2024
87 of 91 checks passed
@GitHK GitHK deleted the pr-osparc-orphaned-containers-removal branch December 17, 2024 09:23

Labels

a:agent (agent service), a:director-v2 (issue related with the director-v2 service)

6 participants